Note on Algorithm Differences Between Nonnegative Matrix Factorization And Probabilistic Latent Semantic Indexing

Authors

  • Zhong-Yuan Zhang
  • Chris Ding
  • Jie Tang
Abstract

NMF and PLSI are two state-of-the-art unsupervised learning models in data mining, and both are widely used in many applications. Previous work has shown that NMF and PLSI are equivalent under some conditions. This raises a new question: if the two models are equivalent, why can they produce different solutions? In other words, the differences between their algorithms have not yet been studied in depth. In this note, we state these algorithmic differences explicitly. Importantly, we find that even when started from the same initialization, NMF and PLSI may converge to different local solutions, and that the differences stem from the additional constraints in PLSI, even though NMF and PLSI optimize the same objective function.
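As a rough illustration of the comparison the abstract describes, the sketch below runs both algorithms from one shared initialization on a small random matrix. It is not the authors' code: it assumes the standard multiplicative updates for NMF with the I-divergence (generalized KL) objective and the textbook EM iteration for PLSI with P(w,d) = sum_k P(w|k) P(k) P(d|k). All function names, sizes, and the random data are hypothetical. Whether the two runs end at different factorizations depends on the data and the initialization, so the sketch only reports the objective values.

```python
import numpy as np

EPS = 1e-12  # guards against division by zero and log(0)

def i_divergence(V, A):
    """I-divergence (generalized KL) between V and its approximation A."""
    return float(np.sum(V * np.log((V + EPS) / (A + EPS)) - V + A))

def nmf_kl_step(V, W, H):
    """One round of multiplicative updates (H first, then W with the new H)."""
    H = H * (W.T @ (V / (W @ H + EPS))) / (W.sum(axis=0)[:, None] + EPS)
    W = W * ((V / (W @ H + EPS)) @ H.T) / (H.sum(axis=1)[None, :] + EPS)
    return W, H

def plsi_em_step(V, Pw_k, Pk, Pd_k):
    """One EM iteration; every factor is renormalized to a distribution."""
    # E-step: posterior P(k | w, d), stored with shape (words, topics, docs).
    joint = Pw_k[:, :, None] * Pk[None, :, None] * Pd_k.T[None, :, :]
    post = joint / (joint.sum(axis=1, keepdims=True) + EPS)
    # M-step: expected counts, then the sum-to-one constraints of PLSI.
    counts = V[:, None, :] * post
    Nk = counts.sum(axis=(0, 2))
    Pw_k = counts.sum(axis=2) / Nk
    Pd_k = counts.sum(axis=0).T / Nk
    Pk = Nk / Nk.sum()
    return Pw_k, Pk, Pd_k

def run_both(n_words=6, n_docs=5, k=2, iters=50, seed=0):
    rng = np.random.default_rng(seed)
    V = rng.random((n_words, n_docs))
    V /= V.sum()  # treat V as a joint distribution over (word, doc)
    # Shared start: probabilistic factors, reused as the NMF factors.
    Pw_k = rng.random((n_words, k))
    Pw_k /= Pw_k.sum(axis=0)
    Pd_k = rng.random((n_docs, k))
    Pd_k /= Pd_k.sum(axis=0)
    Pk = np.full(k, 1.0 / k)
    W, H = Pw_k.copy(), Pk[:, None] * Pd_k.T
    nmf_hist = [i_divergence(V, W @ H)]
    plsi_hist = [i_divergence(V, (Pw_k * Pk) @ Pd_k.T)]
    for _ in range(iters):
        W, H = nmf_kl_step(V, W, H)
        Pw_k, Pk, Pd_k = plsi_em_step(V, Pw_k, Pk, Pd_k)
        nmf_hist.append(i_divergence(V, W @ H))
        plsi_hist.append(i_divergence(V, (Pw_k * Pk) @ Pd_k.T))
    return nmf_hist, plsi_hist, (Pw_k, Pk, Pd_k)

nmf_hist, plsi_hist, plsi_factors = run_both()
print("NMF  objective: start %.6f, end %.6f" % (nmf_hist[0], nmf_hist[-1]))
print("PLSI objective: start %.6f, end %.6f" % (plsi_hist[0], plsi_hist[-1]))
```

In this sketch the only structural difference between the two loops is that the PLSI step renormalizes its factors into probability distributions on every iteration, while the NMF step leaves the scale of W and H free.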

Similar resources

Topic Modeling via Nonnegative Matrix Factorization on Probability Simplex

One important goal of document modeling is to extract a set of informative topics from a text corpus and produce a reduced representation of each document. In this paper, we propose a novel algorithm for this task based on nonnegative matrix factorization on a probability simplex. We further extend our algorithm by removing global and generic information to produce more diverse and specific top...

Fast Parallel Randomized Algorithm for Nonnegative Matrix Factorization with KL Divergence for Large Sparse Datasets

Nonnegative Matrix Factorization (NMF) with Kullback-Leibler Divergence (NMF-KL) is one of the most significant NMF problems and equivalent to Probabilistic Latent Semantic Indexing (PLSI), which has been successfully applied in many applications. For sparse count data, a Poisson distribution and KL divergence provide sparse models and sparse representation, which describe the random variation ...

Nonnegative Matrix Factorizations for Clustering: A Survey

Recently there has been significant development in the use of non-negative matrix factorization (NMF) methods for various clustering tasks. NMF factorizes an input nonnegative matrix into two nonnegative matrices of lower rank. Although NMF can be used for conventional data analysis, the recent overwhelming interest in NMF is due to the newly discovered ability of NMF to solve challenging data ...

NMF-based Models for Tumor Clustering: A Systematic Comparison

Nonnegative Matrix Factorization (NMF) is one of the famous unsupervised learning models. In this paper, we give a short survey on NMF-related models, including K-means, Probabilistic Latent Semantic Indexing etc. and present a new Posterior Probabilistic Clustering model, and compare their numerical experimental results on five real microarray data. The results show that i) NMF using with K-L ...

On the equivalence between Non-negative Matrix Factorization and Probabilistic Latent Semantic Indexing

Non-negative Matrix Factorization (NMF) and Probabilistic Latent Semantic Indexing (PLSI) have been successfully applied to document clustering recently. In this paper, we show that PLSI and NMF (with the I-divergence objective function) optimize the same objective function, although PLSI and NMF are different algorithms as verified by experiments. This provides a theoretical basis for a new hy...

Publication date: 2011